Allow no quantization during QATConfig convert #2694
Conversation
**Summary:** This commit adds back the functionality to swap `FakeQuantized*` modules back to the corresponding `torch.nn.*` modules without performing post-training quantization:

```
QATConfig(base_config=None, step="convert")
```

This has the exact same functionality as this deprecated config:

```
FromIntXQuantizationAwareTrainingConfig()
```

This functionality is added back since it may be useful to users who wish to save QAT-trained checkpoints from models containing only `torch.nn.*` modules (not `FakeQuantized*` modules), e.g. when training and inference need to happen on different machines:

```
quantize_(model, QATConfig(base_ptq_config, step="prepare"))
train(model)
quantize_(model, QATConfig(step="convert"))
torch.save(model.state_dict(), "my_checkpoint.pt")

# On a different machine
model.load_state_dict(torch.load("my_checkpoint.pt"))
quantize_(model, base_ptq_config)
```

**Test Plan:**

```
python test/quantization/test_qat.py -k qat_config_init
python test/quantization/test_qat.py -k qat_api_convert_no_quantization
```
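For readers skimming the PR, here is a minimal end-to-end sketch of the convert-only swap described above. The import paths and the choice of `Int8DynamicActivationInt4WeightConfig` as the base PTQ config are assumptions for illustration; only the `QATConfig(...)` calls come directly from this PR.

```
import torch
import torch.nn as nn
from torchao.quantization import quantize_, Int8DynamicActivationInt4WeightConfig
from torchao.quantization.qat import QATConfig

model = nn.Sequential(nn.Linear(64, 64))
base_ptq_config = Int8DynamicActivationInt4WeightConfig()  # example base config

# Prepare: nn.Linear modules are swapped for FakeQuantized* equivalents
quantize_(model, QATConfig(base_ptq_config, step="prepare"))

# (QAT fine-tuning would happen here)

# Convert with no base config: swap back to plain nn.Linear
# without applying any post-training quantization
quantize_(model, QATConfig(step="convert"))
assert isinstance(model[0], nn.Linear)

# The checkpoint now references only torch.nn.* modules
torch.save(model.state_dict(), "my_checkpoint.pt")
```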
I see, this is the `step="load"` in `AWQConfig`. Would it be easier to just add that?
I don't think we need a different step? This is just separating the convert step into two phases: (1) swap back to `nn.Linear`, and (2) quantize the model, and these can happen on different machines.
`and self.base_config is None`
`and self.weight_config is None`
@andrewor14 Would it be possible to allow `base_config=None` and `weight_config=None`? This would support PARQ's activation-only quantization use case in #2743.
yeah let's do that separately in your PR
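For context on the review thread above, here is a rough sketch of the kind of config validation the highlighted lines belong to; the method name, the `activation_config` field, and the error message are assumptions for illustration, not the actual torchao source.

```
# Hypothetical sketch only; names and structure are assumptions
def __post_init__(self):
    if (
        self.step == "prepare"
        and self.base_config is None
        and self.activation_config is None
        and self.weight_config is None
    ):
        # Preparing with nothing to fake-quantize is a user error
        raise ValueError(
            "One of `base_config`, `activation_config`, or `weight_config` "
            "must be specified for step='prepare'"
        )
    # step='convert' with base_config=None is allowed: it only swaps
    # FakeQuantized* modules back to their torch.nn.* equivalents
```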